Teradata

RETAIL: Broken Digital Journeys Demo



Introduction

This document provides a base demo script for showcasing the capabilities of CX 1-2-3 Broken Digital Journeys through a Jupyter Notebook using teradataml (Teradata Python package) and Plotly. The target audience for either on-premises customers and/or Data Scientists.

The nPath function scans a set of rows, looking for patterns that you specify. For each set of input rows that matches the pattern, nPath produces a single output row. The function provides a flexible pattern-matching capability that lets you specify complex patterns in the input data and define the values that are output for each matched input set.

nPath is useful when your goal is to identify the paths that lead to an outcome. For example, you can use nPath to analyze:

  • Web site click data, to identify paths that lead to sales over a specified amount
  • Sensor data from industrial processes, to identify paths to poor product quality
  • Healthcare records of individual patients, to identify paths that indicate that patients are at risk of developing conditions such as heart disease or diabetes
  • Financial data for individuals, to identify paths that provide information about credit or fraud risks
  • The output from the nPath function can be input to other ML Engine functions or to a visualization tool.

    This demo shows how to use Vantage nPath to analyze customer digital behavior, experience, and diagnosis of problems with customer experience and flow. It uses a simulated digital retail data set. Using this data we are conducting path analysis. Specifically, it has examples of looking at the most frequent customer journeys and identifying any common events that could potentially be stopping the customer from making a purchase.

    1. Start by importing the required libraries and connecting to the Vantage system.

    In the section, we import the required libraries and set environment variables and environment paths (if required).

    In [25]:
    %%capture
    # RESTART KERNEL AFTER THIS. NO NEED TO EXECUTE THIS AFTER RESTART.
    !pip install teradataml --upgrade
    !pip install colorlover
    

    ** Restart Kernel before moving to next cell

    Import the libraries mentioned in the below cell.

    In [1]:
    import json
    import getpass
    import os
    import warnings
    import datetime
    from collections import defaultdict
    
    import pandas as pd
    import numpy as np
    
    import teradataml.dataframe.dataframe as tdf
    from teradataml.dataframe.dataframe import in_schema
    from teradataml.context.context import create_context, remove_context, get_context
    from teradataml.dataframe.copy_to import copy_to_sql
    from teradataml.dataframe.fastload import fastload
    
    from teradataml.analytics.sqle.Sessionize import Sessionize
    from teradataml.analytics.sqle.NPath import NPath
    from teradataml.dataframe.dataframe import DataFrame
    
    from teradatasqlalchemy.types import *
    
    from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
    from collections import defaultdict
    import plotly.offline as offline
    import colorlover as cl
    
    offline.init_notebook_mode()
    
    warnings.filterwarnings('ignore')
    warnings.simplefilter(action='ignore', category=FutureWarning)
    warnings.simplefilter(action='ignore', category=DeprecationWarning)
    

    You will be prompted to provide the password. Enter your password, press the Enter key, then use down arrow to go to next cell. Begin running steps with Shift + Enter keys.

    In [2]:
    %run -i ../startup.ipynb
    eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
    print(eng)
    eng.execute('''SET query_band='DEMO=RetailBrokenDigitalJourneysPython.ipynb;' UPDATE FOR SESSION; ''')
    
    ... performing setup
    ... setup complete
    
    ... Logon successful
    Connected as: teradatasql://demo_user:xxxxx@host.docker.internal/dbc
    Engine(teradatasql://demo_user:***@host.docker.internal)
    
    Out[2]:
    <sqlalchemy.engine.cursor.LegacyCursorResult at 0x7f9a1243d3d0>

    2. Getting Data for This Demo

    We have provided data for this demo on cloud storage. You have the option of either running the demo using foreign tables to access the data without using any storage on your environment or downloading the data to local storage which may yield somewhat faster execution, but there could be considerations of available storage. There are two statements in the following cell, and one is commented out. You may switch which mode you choose by changing the comment string.

    In [3]:
    #%run -i ../run_procedure.py "call get_data('DEMO_DigitalEvents_cloud');"
    # Takes about 12 seconds
    %run -i ../run_procedure.py "call get_data('DEMO_DigitalEvents_local');"
    # Takes about 20 seconds
    
    That ran for   0:00:22.98 with 10 statements and 0 errors. 
    

    Next is an optional step – if you want to see status of databases/tables created and space used.

    In [4]:
    %run -i ../run_procedure.py "call space_report();"
    
    You have:  #databases=3 #tables=73 #views=41  You have used 105.6 MB of 25,731.3 MB available - 0.4%  ... Space Usage OK
     
       Database Name                  #tables  #views     Avail MB      Used MB
       demo_user                           72      33  25,701.3 MB      90.5 MB 
       DEMO_DigitalEvents                   0       1       0.0 MB       0.0 MB 
       DEMO_DigitalEvents_db                1       0      30.0 MB      15.0 MB 
       DEMO_Financial                       0       7       0.0 MB       0.0 MB 
    


    5. Analyze the raw data set

    Create a DataFrame to get the data from the table created.

    In [5]:
    df = tdf.DataFrame(in_schema('DEMO_DigitalEvents', 'Digital_Retail_Events'))
    df.head(5)
    
    Out[5]:
    Customer_Id datestamp event session_id
    32 2021-04-17 18:23:29.000000 Availability Status 2
    32 2021-04-21 20:49:32.000000 Home Page 5
    32 2021-04-16 13:29:55.000000 Product Search 1
    32 2021-04-26 05:18:12.000000 Cart Abandoned 8
    32 2021-04-24 12:21:08.000000 Availability Status 7
    In [6]:
    from teradataml import ConvertTo
    converted_data = ConvertTo(data = df,
                               target_columns = ['Customer_Id','datestamp', 'event','session_id'],
                               target_datatype = ["integer","timestamp","VARCHAR(charlen=20,charset=LATIN,casespecific=NO)","integer"])
    converted_data.result
    
    Out[6]:
    Customer_Id datestamp event session_id
    3018 2021-05-05 23:14:47.500000 Product Search 1
    28304 2021-05-14 14:34:10.000000 Search Not Found 1
    28304 2021-05-13 10:23:11.000000 Product Search 0
    19719 2021-05-09 23:20:24.000000 Category Search 4
    19719 2021-05-09 23:19:24.000000 Home Page 4
    8585 2021-05-09 21:03:22.000000 Home Page 3
    8585 2021-05-07 00:22:04.500000 Product Search 1
    8585 2021-05-09 20:56:27.000000 Home Page 3
    19719 2021-05-07 00:49:47.000000 Home Page 2
    28304 2021-05-16 20:45:18.500000 Product Search 3


    6. Identifying Broken Digital Journey

    This section will walkthrough conducting path analysis using retail data. Specifically, it has examples of looking at the most frequent customer journeys and identifying any common events that could potentially be stopping the customer from making a purchase.

    6.1. Sankey Charts

    Sankey charts can help visualize pathways and volume of events on the most common paths. Teradata VantageCloud does have built-in visualizations, but if users don't have access to these capabilities, or perhaps there is a preference to work in Python, the Plotly package provides Sankey charting capabilities. Below the Python function reads output from nPath as pandas dataframe and plots Sankey chart. This method will be used throughout for plotting Sankey.

    In [7]:
    #Convert Teradata nPath output to plotly Sankey
    #can handle paths up to 999 links in length
    import plotly.graph_objects as go
    def sankeyPlot(res, title_text="Basic Sankey Path"):
        npath_pandas = res
        
        dataDict = defaultdict(int)
        eventDict = defaultdict(int)
        maxPath = npath_pandas['count_customer_id'].max()
        
        for index, row in npath_pandas.iterrows():
            rowList = row['path'].replace('[','').replace(']','').split(',')
            pathCnt = row['count_customer_id']
            pathLen = len(rowList)
            for i in range(len(rowList)-1):
                leftValue = str(100 + i + maxPath - pathLen) + rowList[i].strip()
                rightValue = str(100 + i + 1 + maxPath - pathLen) + rowList[i+1].strip()
                valuePair = leftValue + '+' + rightValue
                dataDict[valuePair] += pathCnt
                eventDict[leftValue] += 1
                eventDict[rightValue] += 1
        
        eventList = []
        for key,val in eventDict.items():
            eventList.append(key)
        
        sortedEventList = sorted(eventList)
        sankeyLabel = []
        for event in sortedEventList:
            sankeyLabel.append(event[3:])
        
        sankeySource = []
        sankeyTarget = []
        sankeyValue = []
    
        for key,val in dataDict.items():
            sankeySource.append(sortedEventList.index(key.split('+')[0]))
            sankeyTarget.append(sortedEventList.index(key.split('+')[1]))
            sankeyValue.append(val)
        
        sankeyColor = []
        for i in sankeyLabel:
            sankeyColor.append('blue')
        
        sankeyChart = dict(
            type='sankey',
            node = dict(
              pad = 15,
              thickness = 20,
              line = dict(
                color = 'black',
                width = 0.5
              ),
              label = sankeyLabel,
              color = sankeyColor
            ),
            link = dict(
                source = sankeySource,
                target = sankeyTarget,
                value = sankeyValue
            )
          )
        layout =  dict(
            title = title_text,
            font = dict(
              size = 10
            )
        )
        
        link = dict(source = sankeySource, target = sankeyTarget, value = sankeyValue, color='white')
        node = dict(label=sankeyLabel, pad=15, thickness=20, color='orange')
        data=go.Sankey(link=link, node=node)
        # plot 
        fig=go.Figure(data)
        fig.update_layout(
                hovermode ='closest',
                title = title_text,
                title_font_size=20,
                font = dict(size = 10, color = 'white'),
                plot_bgcolor='black',
                paper_bgcolor="#585958"
                )
        fig.show()  
    
        
        # fig = dict(data=[sankeyChart], layout=layout)
        # iplot(fig, validate=False)
    

    6.2 Calling a basic nPath function.

  • The Vantage nPath analytic function scans a set of rows, looking for patterns.
  • For each set of input rows that matches the pattern, nPath produces a single output row.
  • This function allows for matching of complex patterns in the input data, as well as defining the output values for each matched set of rows.

    For the below example:

    1. Pass the input data by reference.

    2. Provide partitioning (customer_id, session_id) and ordering columns.

    3. Mode.**OVERLAPPING** vs. **NONOVERLAPPING**

  • **OVERLAPPING** finds every occurrence of the match, regardless of the current row being part of a previous match.
  • **NONOVERLAPPING** starts matching again at the row that follows the previous match.
  • 4. Symbols. Create a set of column expression aliases that can be assembled into a pattern to match.

  • Example: "EVENT = 'Home Page' as P" will alias a match on the EVENT column when the content equals 'Home Page'.
  • 5. Pattern. Compose a pattern to search for across the rows of events. This pattern is composed of Symbols and directives.

  • Example: '^P' uses a directive ^ to indicate the P Symbol must occur at the beginning of the group of rows.
  • 6. Result. Since nPath emits a single row per group-of-row matches, Result indicates what columns make up this row and how to aggregate the data.

    6.3 Path to Sales Conversion

    Let's start the analysis by taking a look at what a successful path looks like. This is accomplished by running the Vantage nPath analytic function and specifying the pattern as from Any Event to Sales Conversion.

    In [8]:
    copy_to_sql(converted_data.result, table_name = 'npath_data', schema_name = 'demo_user', 
                if_exists = 'replace')
    
    In [9]:
    #Create two symbols and assemble them with directives:
    # 1. EVENT Column match the string 'Sales Conversion' as B and all other EVENT as O
    # Pattern directs a range of any row (O) between 1 and 4 times preceding 'Sales Conversion' (B) - O{1,4}.B
    
    npath_sessions = NPath(data1 = DataFrame('"demo_user"."npath_data"'), 
                          data1_partition_column = ['Customer_Id','session_id'], 
                          data1_order_column = 'datestamp', 
                          mode = 'NONOVERLAPPING', 
                          symbols = ['event in (\'Sales Conversion\') as B', 'event not in (\'Sales Conversion\') as O'], 
                          pattern = 'O{1,4}.B', 
                          result = ['FIRST (customer_id OF O) AS customer_id',
                                    'ACCUMULATE (event of any(O,B) ) AS path'])
    
    npath_sessions.result\
                        .groupby(['path'])\
                        .count()\
                        .sort('count_customer_id',ascending=False)\
                        .to_pandas()\
                        .head(10)
    
    Out[9]:
    path count_customer_id
    0 [Add To Cart, Sales Conversion] 35
    1 [Home Page, Product Search, Product Details, A... 34
    2 [Product Search, Home Page, Sales Conversion] 30
    3 [Home Page, Availability Status, Sales Convers... 23
    4 [Product Search, Sales Conversion] 20
    5 [Add To Cart, Product Search, Sales Conversion] 18
    6 [Product Search, Add To Cart, Sales Conversion] 17
    7 [Home Page, Sales Conversion] 17
    8 [Home Page, Product Search, Sales Conversion] 16
    9 [Product Search, Product Details, Add To Cart,... 11

    Path to Sales Conversion Visualization

    In the table view you can see that the most common last step before Sales Conversion is Add to Cart. Visualizing the paths on a Sankey diagram is an effective way to see common digital customer journey patterns.

    In [11]:
    #warnings.simplefilter(action='ignore', category=DeprecationWarning)
    res = npath_sessions.result\
                        .groupby(['path'])\
                        .count()\
                        .sort('count_customer_id',ascending=False)\
                        .head(50)\
                        .to_pandas()
    sankeyPlot(res,"Path to Sales Conversion")
    

    The TOP 3 events that led to Sales Conversion are Add To Cart ,Product Search and Availability Status.

    To check the details of any path or node we can move the mouse pointer over it and check details. For example if you mouse the pointer over the path having the largest width going towards the right most node(Sales Conversion) it shows 55.0, source: Add To Cart, target: Sales Conversion. It means there were 55 events where after adding the product Cart the next step was that the product was bought, which led to Sales.

    When the pointer is moved over a Node, for example when the pointer is on the last Node at the right Availability Status it shows incoming flow count: 4 and outgoing flow count: 1 which means that there are 4 different events which lead to Availability Status after which the next event is Sales Conversion.

    6.4 Path to Cart Abandoned

    The data available tracks customers who abandon their carts, change the Ending Pattern and see if you can learn anything from the common paths to the Cart Abandoned event.

    In [12]:
    #Create two symbols and assemble them with directives:
    # 1. EVENT Column match the string 'Cart Abandoned' as B and all other EVENT as O
    # Pattern directs a range of any row (O) between 1 and 4 times preceding 'Cart Abandoned' (B) - O{1,4}.B
    
    npath_sessions = NPath(data1 = DataFrame('"demo_user"."npath_data"'), 
                          data1_partition_column = ['Customer_Id','session_id'], 
                          data1_order_column = 'datestamp', 
                          mode = 'NONOVERLAPPING', 
                          symbols = ['event in (\'Cart Abandoned\') as B', 'event not in (\'Cart Abandoned\') as O'], 
                          pattern = 'O{1,4}.B', 
                          result = ['FIRST (customer_id OF O) AS customer_id',
                                    'ACCUMULATE (event of any(O,B) ) AS path'])
    
    npath_sessions.result\
                        .groupby(['path'])\
                        .count()\
                        .sort('count_customer_id',ascending=False)\
                        .to_pandas()\
                        .head(10)
    
    Out[12]:
    path count_customer_id
    0 [Home Page, Product Search, Search Not Found, ... 538
    1 [Product Search, Home Page, Search Not Found, ... 225
    2 [Product Search, Search Not Found, Cart Abando... 158
    3 [Home Page, Search Not Found, Cart Abandoned] 94
    4 [Availability Status, Home Page, Product Searc... 85
    5 [Availability Status, Product Search, Home Pag... 69
    6 [Home Page, Category Search, Product Search, S... 64
    7 [Home Page, Product Search, Search Not Found, ... 62
    8 [Search Not Found, Home Page, Product Search, ... 58
    9 [Home Page, Cart Abandoned] 50
    In [13]:
    res = npath_sessions.result\
                        .groupby(['path'])\
                        .count()\
                        .sort('count_customer_id',ascending=False)\
                        .head(50)\
                        .to_pandas()
    sankeyPlot(res,"Path to Cart Abandoned")
    

    The Cart was abondoned by most of the customers of the Search Not Found event occurred.

    6.5 Path to Search Not Found

    As you can see from the common paths to Cart Abandoned - Search Not Found is experienced by many customers. Explore this further by changing the ending pattern to Search Not Found.

    In [14]:
    #Create two symbols and assemble them with directives:
    # 1. EVENT Column match the string 'Search Not Found' as B and all other EVENT as O
    # Pattern directs a range of any row (O) between 1 and 4 times preceding 'Search Not Found' (B) - O{1,4}.B
    
    npath_sessions = NPath(data1 = DataFrame('"demo_user"."npath_data"'), 
                          data1_partition_column = ['Customer_Id','session_id'], 
                          data1_order_column = 'datestamp', 
                          mode = 'NONOVERLAPPING', 
                          symbols = ['event in (\'Search Not Found\') as B', 'event not in (\'Search Not Found\') as O'], 
                          pattern = 'O{1,4}.B', 
                          result = ['FIRST (customer_id OF O) AS customer_id',
                                    'ACCUMULATE (event of any(O,B) ) AS path'])
    
    npath_sessions.result\
                        .groupby(['path'])\
                        .count()\
                        .sort('count_customer_id',ascending=False)\
                        .to_pandas()\
                        .head(10)
    
    Out[14]:
    path count_customer_id
    0 [Home Page, Product Search, Search Not Found] 3175
    1 [Product Search, Home Page, Search Not Found] 2017
    2 [Product Search, Search Not Found] 973
    3 [Home Page, Search Not Found] 934
    4 [Home Page, Product Search, Home Page, Search ... 506
    5 [Home Page, Home Page, Product Search, Search ... 491
    6 [Home Page, Product Search, Home Page, Product... 339
    7 [Home Page, Product Search, Product Search, Ho... 234
    8 [Product Search, Home Page, Home Page, Product... 225
    9 [Product Search, Home Page, Product Search, Ho... 219
    In [15]:
    res = npath_sessions.result\
                        .groupby(['path'])\
                        .count()\
                        .sort('count_customer_id',ascending=False)\
                        .head(50)\
                        .to_pandas()
    
    sankeyPlot(res,"Path to Search Not Found")
    

    The Search Not Found event occurred mainly after Product Search from the Home Page.

    6.6 Path From Search Not Found

    With a significant number of paths leading to Search Not found, as well as the number of customers on these paths, we need to explore more on this.

    Now change to swap have 'Search Not Found' as the starting event and see where customers go to after this key event

    In [16]:
    #Create two symbols and assemble them with directives:
    # 1. EVENT Column match the string 'Search Not Found' as A and all other EVENT as O
    # Pattern directs a range of any row (O) between 1 and 4 times succeeding 'Search Not Found' (A) - A.O{1,4}
    
    npath_sessions = NPath(data1 = DataFrame('"demo_user"."npath_data"'), 
                          data1_partition_column = ['Customer_Id','session_id'], 
                          data1_order_column = 'datestamp', 
                          mode = 'NONOVERLAPPING', 
                          symbols = ['event in (\'Search Not Found\') as A', 'true as O'], 
                          pattern = 'A.O{1,4}', 
                          result = ['FIRST (customer_id OF O) AS customer_id',
                                    'ACCUMULATE (event of any(A,O) ) AS path'])
    
    npath_sessions.result\
                        .groupby(['path'])\
                        .count()\
                        .sort('count_customer_id',ascending=False)\
                        .to_pandas()\
                        .head(10)
    
    Out[16]:
    path count_customer_id
    0 [Search Not Found, Home Page] 940
    1 [Search Not Found, Category Search] 808
    2 [Search Not Found, Cart Abandoned] 748
    3 [Search Not Found, Home Page, Product Search] 609
    4 [Search Not Found, Product Search, Home Page] 507
    5 [Search Not Found, Product Details] 248
    6 [Search Not Found, Product Search] 226
    7 [Search Not Found, Home Page, Category Search] 196
    8 [Search Not Found, Cart Abandoned, Home Page] 142
    9 [Search Not Found, Home Page, Product Search, ... 119
    In [17]:
    res = npath_sessions.result\
                        .groupby(['path'])\
                        .count()\
                        .sort('count_customer_id',ascending=False)\
                        .head(50)\
                        .to_pandas()
    
    sankeyPlot(res,"Path From Search Not Found")
    

    The events that were followed by Search Not Found were mainly Cart Abandoned and Availability Status.

    6.7 Storing Customers on a Path

    With the above analysis, it can now be confirmed that there is definitely an issue with unsuccessful searches. From here business actions could be taken such as:

  • Raising a defect for our Website Development team to focus on and with the data supporting this, it should be a high priority!
  • Save a list of the customers on the dominant path so that additional analysis could be performed and/or so that the Customer Care team can proactively reach out to the customers.
  • The next section will demonstrate how to save a list of customers on any path of interest.

    In [18]:
    #Create symbols for each event in our desired path and assemble them with directives:
    # 1. EVENT Column match the string 'Cart Abandoned' as A
    # 2. EVENT Column match the string 'Search Not Found' as B
    # 3. EVENT Column match the string 'Product Search' as C
    # 4. EVENT Column match the string 'Home Page' as D
    # 5. EVENT Column match the string 'Search Not Found' as E
    # Pattern directs events in a particular order which in this case is E.D.C.B.A
    
    npath_sessions = NPath(data1 = DataFrame('"demo_user"."npath_data"'), 
                          data1_partition_column = ['Customer_Id','session_id'], 
                          data1_order_column = 'datestamp', 
                          mode = 'NONOVERLAPPING', 
                          symbols = ['event = \'Search Not Found\' as E', 
                                     'event = \'Home Page\' as D', 
                                     'event = \'Product Search\' as C', 
                                     'event = \'Search Not Found\' as B', 
                                     'event = \'Cart Abandoned\' as A'], 
                          pattern = 'E.D.C.B.A', 
                          result = ['FIRST (customer_id of ANY (E,D,C,B,A)) AS customer_id',
                                    'ACCUMULATE (event of ANY (E,D,C,B,A)) AS path'])
    
    npath_sessions.result.head()
    
    Out[18]:
    customer_id path
    1407 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    1988 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    2522 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    3326 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    6617 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    6981 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    3507 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    1578 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    1185 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    1149 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    In [19]:
    # Now we will take the list of distinct customers who have taken this path and 
    # store it in Vantage table for detailed analysis
    
    res = npath_sessions.result\
                  .groupby(['customer_id','path'])\
                  .count()\
                  .sort('count_path',ascending=False)\
                  .to_pandas()\
                  .reset_index()
    
    # Writing clean_input dataframe into MEDIQAN_TEMP_INPUT_TABLE table in Vantage
    copy_to_sql(df = res[['customer_id','path']], 
                table_name = 'CustomersOnPath', 
                #schema_name = 'target_db', # Uncomment & specify if not using the Vantage Live Default Database, which usually is your QLID.
                if_exists="replace")
    
    In [20]:
    result_df = tdf.DataFrame('CustomersOnPath')
    print("Number of records in CustomersOnPath : "+str(result_df.shape[0]))
    print("\n")
    print("Sample records: \n")
    result_df.head(5)
    
    Number of records in CustomersOnPath : 56
    
    
    Sample records: 
    
    
    Out[20]:
    customer_id path
    1407 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    1988 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    1578 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    1185 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]
    1149 [Search Not Found, Home Page, Product Search, Search Not Found, Cart Abandoned]

    The table contains both the customer (Customer_id) and Path (the dominant path selected)

    This saved segment can be used as input for further analysis, such as clustering to see if there are any commonalities across the customers, products searched or potentially as input into a customer care process to reach out to the customers to acknowledge their bad experience with our online store.


    10. Cleanup

    Work Tables

    In [21]:
    eng.execute('DROP TABLE CustomersOnPath;')
    
    Out[21]:
    <sqlalchemy.engine.cursor.LegacyCursorResult at 0x7f99dd902940>
    In [22]:
    eng.execute('DROP TABLE npath_data;')
    
    Out[22]:
    <sqlalchemy.engine.cursor.LegacyCursorResult at 0x7f99ddb366d0>

    Databases and Tables

    The following code will clean up tables and databases created above.

    In [23]:
    %run -i ../run_procedure.py "call remove_data('DEMO_DigitalEvents');" 
    #Takes 3 seconds
    
    Removed objects related to DEMO_DigitalEvents. That ran for 0:00:02.73
    
    In [24]:
    # Removing the connection and clearing session
    remove_context()
    
    Out[24]:
    True


    Conclusion

    As you can see from this brief demonstration, the Vantage nPath Analytic function is an effective way to conduct path analysis, such as the one we just viewed. Path analysis can span multiple topics and crosses industries.

      Examples include:
    • Customer paths to complaints
    • Paths to part failure
    • Paths to Disease
    • Path to Churn

    Links

  • Teradata Python Package User Guide
  • Teradataml Python Reference
  • Teradata nPath Function Reference:
  • Teradata Sessionize Function Reference
  • Python Pandas Reference
  • Plotly Reference
  • Copyright © Teradata Corporation - 2023. All Rights Reserved.